Divide County
AI-Driven Expansion and Application of the Alexandria Database
Cavignac, Théo, Schmidt, Jonathan, De Breuck, Pierre-Paul, Loew, Antoine, Cerqueira, Tiago F. T., Wang, Hai-Chen, Bochkarev, Anton, Lysogorskiy, Yury, Romero, Aldo H., Drautz, Ralf, Botti, Silvana, Marques, Miguel A. L.
We present a novel multi-stage workflow for computational materials discovery that achieves a 99% success rate in identifying compounds within 100 meV/atom of thermodynamic stability, with a threefold improvement over previous approaches. By combining the Matra-Genoa generative model, Orb-v2 universal machine learning interatomic potential, and ALIGNN graph neural network for energy prediction, we generated 119 million candidate structures and added 1.3 million DFT-validated compounds to the ALEXANDRIA database, including 74 thousand new stable materials. The expanded ALEXANDRIA database now contains 5.8 million structures with 175 thousand compounds on the convex hull. Predicted structural disorder rates (37-43%) match experimental databases, unlike other recent AI-generated datasets. Analysis reveals fundamental patterns in space group distributions, coordination environments, and phase stability networks, including sub-linear scaling of convex hull connectivity. We release the complete dataset, including sAlex25 with 14 million out-of-equilibrium structures containing forces and stresses for training universal force fields. We demonstrate that fine-tuning a GRACE model on this data improves benchmark accuracy. All data, models, and workflows are freely available under Creative Commons licenses.
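As a rough illustration of what "within 100 meV/atom of thermodynamic stability" measures, the sketch below computes the distance to the convex hull for a hypothetical binary system A-B. It is a simplified, assumption-laden example and not the authors' workflow: the compositions and formation energies are invented, the elemental references are assumed to sit at zero formation energy, and real databases build multi-component hulls from DFT energies.

```python
# Minimal sketch: energy above the convex hull in a hypothetical binary system.
# x is the fraction of element B; energies are formation energies in eV/atom.
# All numbers below are made up for illustration.

def cross(o, a, b):
    """2D cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def lower_hull(points):
    """Lower convex hull (monotone chain), keeping the lowest energy per composition."""
    best = {}
    for x, e in points:
        best[x] = min(e, best.get(x, e))
    hull = []
    for p in sorted(best.items()):
        # Pop the last vertex while it lies on or above the line to the new point.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull

def energy_above_hull(x, e, hull):
    """Vertical distance (eV/atom) from (x, e) to the hull; needs hull[0].x <= x <= hull[-1].x."""
    for (x0, e0), (x1, e1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            e_hull = e0 + (e1 - e0) * (x - x0) / (x1 - x0)
            return e - e_hull
    raise ValueError("composition outside the hull range")

# Assumed reference phases: pure A, pure B, and two hypothetical compounds.
known = [(0.0, 0.0), (1.0, 0.0), (0.5, -0.4), (0.25, -0.1)]
hull = lower_hull(known)

# Hypothetical candidate at x = 0.75 with formation energy -0.15 eV/atom.
e_hull_mev = 1000 * energy_above_hull(0.75, -0.15, hull)
within_threshold = e_hull_mev <= 100  # the 100 meV/atom criterion from the abstract
```

In this toy system the compound at x = 0.25 is pushed off the hull by the deeper phase at x = 0.5, which is why the hull keeps only three vertices.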
- Europe > Switzerland > Zürich > Zürich (0.04)
- Europe > Portugal > Coimbra > Coimbra (0.04)
- North America > United States > West Virginia > Monongalia County > Morgantown (0.04)
- (3 more...)
- Workflow (1.00)
- Research Report (1.00)
- Materials > Chemicals (0.46)
- Government (0.46)
Machine Learning for Sustainable Rice Production: Region-Scale Monitoring of Water-Saving Practices in Punjab, India
Shah, Ando, Singh, Rajveer, Zaytar, Akram, Tadesse, Girmaw Abebe, Robinson, Caleb, Tafti, Negar, Wood, Stephen A., Dodhia, Rahul, Ferres, Juan M. Lavista
In regions like Punjab, India, where groundwater levels are plummeting at 41.6 cm/year, adopting water-saving rice farming practices is critical. Direct-Seeded Rice (DSR) and Alternate Wetting and Drying (AWD) can cut irrigation water use by 20-40% without hurting yields, yet lack of spatial data on adoption impedes effective adaptation policy and climate action. We present a machine learning framework to bridge this data gap by monitoring sustainable rice farming at scale. In collaboration with agronomy experts and a large-scale farmer training program, we obtained ground-truth data from 1,400 fields across Punjab. Leveraging this partnership, we developed a novel dimensional classification approach that decouples sowing and irrigation practices, achieving F1 scores of 0.8 and 0.74 respectively, solely employing Sentinel-1 satellite imagery. Explainability analysis reveals that DSR classification is robust while AWD classification depends primarily on planting schedule differences, as Sentinel-1's 12-day revisit frequency cannot capture the higher-frequency irrigation cycles characteristic of AWD practices. Applying this model across 3 million fields reveals spatial heterogeneity in adoption at the state level, highlighting gaps and opportunities for policy targeting. Our district-level adoption rates correlate well with government estimates (Spearman's ρ=0.69 and Rank Biased Overlap=0.77). This study provides policymakers and sustainability programs a powerful tool to track practice adoption, inform targeted interventions, and drive data-driven policies for water conservation and climate mitigation at regional scale.
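The two agreement measures quoted in the abstract, Spearman's ρ and Rank Biased Overlap (RBO), can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the district names and adoption rates are hypothetical, the RBO variant shown is the extrapolated form for two equal-length rankings without duplicates, and the persistence parameter `p = 0.9` is an arbitrary choice, since the abstract does not say which variant or parameter the authors used.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

def rbo_ext(s, t, p=0.9):
    """Extrapolated RBO for two rankings, truncated to their common length."""
    k = min(len(s), len(t))
    seen_s, seen_t = set(), set()
    overlap, weighted = 0, 0.0
    for d in range(1, k + 1):
        a, b = s[d - 1], t[d - 1]
        if a == b:
            overlap += 1
        else:
            overlap += (a in seen_t) + (b in seen_s)
        seen_s.add(a)
        seen_t.add(b)
        weighted += (overlap / d) * p ** d  # agreement at depth d, weighted by p^d
    return (overlap / k) * p ** k + ((1 - p) / p) * weighted

# Hypothetical district-level adoption rates (model vs. official estimate).
model = {"D1": 0.31, "D2": 0.12, "D3": 0.25, "D4": 0.05}
official = {"D1": 0.28, "D2": 0.15, "D3": 0.30, "D4": 0.04}

districts = sorted(model)
rho = spearman_rho([model[d] for d in districts], [official[d] for d in districts])

# RBO compares the districts ordered from highest to lowest adoption.
rank_model = sorted(model, key=model.get, reverse=True)
rank_official = sorted(official, key=official.get, reverse=True)
rbo = rbo_ext(rank_model, rank_official)
```

Spearman's ρ weights every district equally, whereas RBO emphasises agreement near the top of the ranking, which is presumably why reporting both is informative for policy targeting.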
Actionable Counterfactual Explanations Using Bayesian Networks and Path Planning with Applications to Environmental Quality Improvement
Valero-Leal, Enrique, Larrañaga, Pedro, Bielza, Concha
Counterfactual explanations study what should have changed in order to get an alternative result, enabling end-users to understand machine learning mechanisms with counterexamples. Actionability is defined as the ability to transform the original case to be explained into a counterfactual one. We develop a method for actionable counterfactual explanations that, unlike its predecessors, does not directly leverage training data. Rather, data is only used to learn a density estimator, creating a search landscape in which to apply path planning algorithms to solve the problem and masking the endogenous data, which can be sensitive or private. We put special focus on estimating the data density using Bayesian networks, demonstrating how their enhanced interpretability is useful in high-stakes scenarios in which fairness is a rising concern. Using a synthetic benchmark comprising 15 datasets, our proposal finds more actionable and simpler counterfactuals than the current state-of-the-art algorithms. We also test our algorithm with a real-world Environmental Protection Agency dataset, facilitating a more efficient and equitable study of policies to improve the quality of life in United States of America counties. Our proposal captures the interaction of variables, ensuring equity in decisions, as policies to improve certain domains of study (air, water quality, etc.) can be detrimental in others. In particular, the sociodemographic domain is often involved, where we find important variables related to the ongoing housing crisis that can potentially have a severe negative impact on communities.
- Europe > Spain > Galicia > Madrid (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (3 more...)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.46)
- Health & Medicine (1.00)
- Law (0.87)
- Banking & Finance > Real Estate (0.66)
- (3 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
AutoML-based Almond Yield Prediction and Projection in California
Duan, Shiheng, Wu, Shuaiqi, Monier, Erwan, Ullrich, Paul
Almonds are one of the most lucrative products of California, but are also among the most sensitive to climate change. In order to better understand the relationship between climatic factors and almond yield, an automated machine learning framework is used to build a collection of machine learning models. The prediction skill is assessed using historical records. Future projections are derived using 17 downscaled climate outputs. The ensemble mean projection displays almond yield changes under two different climate scenarios, along with two technology development scenarios, where the role of technology development is highlighted. The mean projections and distributions provide insightful results to stakeholders and can be utilized by policymakers for climate adaptation.
- North America > United States > California > Yolo County > Davis (0.16)
- North America > United States > North Dakota > Divide County (0.04)
- North America > United States > California > Alameda County > Livermore (0.04)
- (2 more...)
Attributed Network Embedding Model for Exposing COVID-19 Spread Trajectory Archetypes
Ma, Junwei, Li, Bo, Li, Qingchun, Fan, Chao, Mostafavi, Ali
The spread of COVID-19 revealed that transmission risk patterns are not homogenous across different cities and communities, and various heterogeneous features can influence the spread trajectories. Hence, for predictive pandemic monitoring, it is essential to explore latent heterogeneous features in cities and communities that distinguish their specific pandemic spread trajectories. To this end, this study creates a network embedding model capturing cross-county visitation networks, as well as heterogeneous features to uncover clusters of counties in the United States based on their pandemic spread transmission trajectories. First, we collected and computed location intelligence features from 2,787 counties from March 3 to June 29, 2020 (initial wave). Second, we constructed a human visitation network, which incorporated county features as node attributes, and visits between counties as network edges. Our attributed network embedding approach integrates both the topological characteristics of the cross-county visitation network and heterogeneous features. We conducted clustering analysis on the attributed network embeddings to reveal four archetypes of spread risk trajectories corresponding to four clusters of counties. Subsequently, we identified four features as important features underlying the distinctive transmission risk patterns among the archetypes. The attributed network embedding approach and the findings identify and explain the non-homogenous pandemic risk trajectories across counties for predictive pandemic monitoring. The study also contributes to data-driven and deep learning-based approaches for pandemic analytics to complement the standard epidemiological models for policy analysis in pandemics.
- North America > United States > Arkansas > Cross County (0.46)
- North America > United States > Texas > Brazos County > College Station (0.14)
- South America > Brazil (0.04)
- (12 more...)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Communications (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Multi-Year Vector Dynamic Time Warping Based Crop Mapping
Teke, Mustafa, Yardımcı, Yasemin
Abstract: Recent automated crop mapping methods based on supervised learning have demonstrated unprecedented improvement over classical techniques. However, most crop mapping studies are limited to same-year crop mapping, in which the present year's labeled data is used to predict the same year's crop map. Cross-year crop mapping is more useful as it allows the prediction of the following years' crop maps using previously labeled data. We propose Vector Dynamic Time Warping (VDTW), a novel multi-year classification approach based on warping of angular distances between phenological vectors. The results prove that the proposed VDTW method is robust to temporal and spectral variations, compensating for different farming practices, climate and atmospheric effects, and measurement errors between years. We also describe a method for determining the most discriminative time window that allows high classification accuracies with limited data. We carried out tests of our approach with Landsat 8 time-series imagery from years 2013 to 2016 for classification of corn and cotton in the Harran Plain, and corn, cotton, and soybean in the Bismil Plain of Southeastern Turkey. In addition, we tested VDTW on corn and soybean in Kansas, USA, for 2017 and 2018 with the Harmonized Landsat Sentinel data. The VDTW method achieved 99.85% and 99.74% overall accuracies for the same and cross years, respectively, with fewer training samples compared to other state-of-the-art approaches, i.e. spectral angle mapper (SAM), dynamic time warping (DTW), time-weighted DTW (TWDTW), random forest (RF), support vector machine (SVM) and deep long short-term memory (LSTM) methods. The proposed method could be expanded for other crop types and/or geographical areas.
Keywords: Time series; phenology; multi-year classification; dynamic programming; Landsat; crop mapping; land use; corn; cotton; soybean
1. Introduction
The world population is expected to exceed nine billion in 2050 [1]. Providing adequate nutrition for the increasing human population is a significant concern. Advanced agricultural technologies, such as precision agriculture and precision irrigation, are rapidly emerging to optimize water, fertilizers, and pesticides, thereby enabling higher crop yield. Accurate crop maps are the first requirements of advanced agriculture applications such as yield forecasting. Early-season crop yield estimates are a crucial factor for food security and monitoring agricultural subventions. Crop maps are also an essential tool for statistical purposes to analyze annual changes in agricultural production. However, there are a variety of field crops with similar phenologies and spectral signatures.
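The core idea named in the abstract, warping angular distances between per-date vectors, can be sketched as ordinary dynamic time warping with a spectral-angle local cost. This is a minimal sketch and not the authors' VDTW implementation: how the phenological vectors are built, any warping-window constraints, and the time-window selection are not given in the text above, so the two-band toy vectors and the nearest-template classifier below are assumptions for illustration only.

```python
from math import acos, sqrt

def angular_distance(u, v):
    """Angle in radians between two vectors; insensitive to overall brightness scaling."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("zero-length vector has no direction")
    return acos(max(-1.0, min(1.0, dot / norm)))  # clamp against rounding error

def dtw_cost(seq_a, seq_b, dist=angular_distance):
    """Classic DTW by dynamic programming; returns the minimal accumulated cost."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def classify(series, templates):
    """Assign the label of the template with the smallest DTW cost (hypothetical helper)."""
    return min(templates, key=lambda label: dtw_cost(series, templates[label]))

# Toy per-date (red, near-infrared) vectors; all values are invented.
templates = {
    "corn": [(0.20, 0.25), (0.10, 0.45), (0.05, 0.60), (0.15, 0.30)],
    "cotton": [(0.22, 0.24), (0.20, 0.28), (0.12, 0.40), (0.06, 0.55)],
}

# A later-year corn field: green-up delayed by one date and uniformly brighter.
field = [(0.24, 0.30), (0.24, 0.30), (0.12, 0.54), (0.06, 0.72), (0.18, 0.36)]
label = classify(field, templates)
```

The warping absorbs the one-date shift in planting, and the angular cost ignores the uniform brightness change, which mirrors the two kinds of between-year variation the abstract says the method is robust to.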
- Asia > China (0.14)
- North America > United States > Montana (0.14)
- Europe > United Kingdom > North Sea > Southern North Sea (0.04)
- (14 more...)